A reinforcement learning approach to stochastic business games
Abstract
The Internet revolution has resulted in increased competition among providers of goods and services to lure customers by tearing down the barriers of time and distance. For example, a home buyer shopping for a mortgage loan through the Internet is now a potential customer for a large number of lending institutions throughout the world. The lenders (players, in generic game theory nomenclature) seeking to capture this customer are involved in a nonzero-sum stochastic game. Stochastic games are among the least studied and understood of the management science problems, and no computationally tractable solution technique is available for multi-player nonzero-sum stochastic games. We now develop a computer-simulation-based machine learning algorithm that can be used to solve nonzero-sum stochastic game problems that are modeled as competitive Markov decision processes. The methodology based on this algorithm is implemented on a supply chain inventory planning problem with a limited state space. The equilibrium reward obtained from the stochastic game problem is compared with a logical upper bound obtained from the corresponding Markov decision problem in which a single decision maker (player) is substituted for all the competing players in the game. Several numerical versions of the problem are studied to assess the performance of the methodology. The results obtained from our methodology for the inventory planning problems are within 0.8% of the upper bound.
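The upper-bound comparison described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the two-state, two-player toy game, its transition probabilities and rewards, the discounted criterion with `GAMMA = 0.9` (the paper's reward criterion may differ), and all function names. The sketch solves the single-decision-maker MDP by value iteration over joint actions, evaluates one fixed joint policy, and reports the percentage gap between the two.

```python
import itertools

# Hypothetical toy game: 2 states, 2 players, 2 actions each. All numbers are made up.
STATES = [0, 1]
ACTIONS = [0, 1]
GAMMA = 0.9  # discount factor (assumption; the paper's criterion may differ)


def transition(state, a1, a2):
    """Return {next_state: probability} for a joint action (illustrative)."""
    p_stay = 0.8 if a1 == a2 else 0.3
    return {state: p_stay, 1 - state: 1.0 - p_stay}


def total_reward(state, a1, a2):
    """Sum of both players' one-step rewards (illustrative)."""
    base = 1.0 + state
    r1 = base + a1 - 0.5 * a2
    r2 = base + a2 - 0.5 * a1
    return r1 + r2


def q_value(values, state, a1, a2):
    """One-step lookahead on total reward for a joint action."""
    future = sum(p * values[s2] for s2, p in transition(state, a1, a2).items())
    return total_reward(state, a1, a2) + GAMMA * future


def centralized_upper_bound(tol=1e-10):
    """Value iteration for a single decision maker choosing both actions."""
    values = {s: 0.0 for s in STATES}
    while True:
        new = {
            s: max(q_value(values, s, a1, a2)
                   for a1, a2 in itertools.product(ACTIONS, ACTIONS))
            for s in STATES
        }
        if max(abs(new[s] - values[s]) for s in STATES) < tol:
            return new
        values = new


def evaluate_joint_policy(policy, tol=1e-10):
    """Total discounted reward of a fixed joint policy {state: (a1, a2)}."""
    values = {s: 0.0 for s in STATES}
    while True:
        new = {s: q_value(values, s, *policy[s]) for s in STATES}
        if max(abs(new[s] - values[s]) for s in STATES) < tol:
            return new
        values = new


def percent_gap(bound, achieved):
    """Shortfall of the achieved reward relative to the upper bound, in percent."""
    return 100.0 * (bound - achieved) / bound


if __name__ == "__main__":
    bound = centralized_upper_bound()
    # A hypothetical joint policy standing in for a learned equilibrium.
    game_values = evaluate_joint_policy({0: (1, 0), 1: (0, 1)})
    for s in STATES:
        print(s, round(percent_gap(bound[s], game_values[s]), 3))
```

Because the single decision maker optimizes over every joint action, its value can never fall below the total reward of any joint policy the competing players settle on, which is why it serves as a logical upper bound.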
Similar articles
Learning in Average Reward Stochastic Games: A Reinforcement Learning (Nash-R) Algorithm for Average Reward Irreducible Stochastic Games
A large class of sequential decision making problems under uncertainty with multiple competing decision makers can be modeled as stochastic games. It can be considered that the stochastic games are multiplayer extensions of Markov decision processes (MDPs). In this paper, we develop a reinforcement learning algorithm to obtain average reward equilibrium for irreducible stochastic games. In our ...
Multiagent Reinforcement Learning in Stochastic Games
We adopt stochastic games as a general framework for dynamic noncooperative systems. This framework provides a way of describing the dynamic interactions of agents in terms of individuals' Markov decision processes. By studying this framework, we go beyond the common practice in the study of learning in games, which primarily focuses on repeated games or extensive-form games. For stochastic games...
Multiagent Reinforcement Learning in Stochastic Games with Continuous Action Spaces
We investigate the learning problem in stochastic games with continuous action spaces. We focus on repeated normal form games, and discuss issues in modelling mixed strategies and adapting learning algorithms in finite-action games to the continuous-action domain. We applied variable resolution techniques to two simple multi-agent reinforcement learning algorithms PHC and MinimaxQ. Preliminary ...
A Near-Optimal Poly-Time Algorithm for Learning a Class of Stochastic Games
We present a new algorithm for polynomial time learning of near optimal behavior in stochastic games. This algorithm incorporates and integrates important recent results of Kearns and Singh [1998] in reinforcement learning and of Monderer and Tennenholtz [1997] in repeated games. In stochastic games we face an exploration vs. exploitation dilemma more complex than in Markov decision processes....
Joint Learning in Stochastic Games: Playing Coordination Games Within Coalitions
Despite the progress in multiagent reinforcement learning via formalisms based on stochastic games, these have difficulties coping with a high number of agents due to the combinatorial explosion in the number of joint actions. One possible way to reduce the complexity of the problem is to let agents form groups of limited size so that the number of the joint actions is reduced. This paper inves...